 complex workflow


Rethinking Agentic Workflows: Evaluating Inference-Based Test-Time Scaling Strategies in Text2SQL Tasks

arXiv.org Artificial Intelligence

Large language models (LLMs) are increasingly powering Text-to-SQL (Text2SQL) systems, enabling non-expert users to query industrial databases using natural language. While test-time scaling strategies have shown promise in LLM-based solutions, their effectiveness in real-world applications, especially with the latest reasoning models, remains uncertain. In this work, we benchmark six lightweight, industry-oriented test-time scaling strategies and four LLMs, including two reasoning models, evaluating their performance on the BIRD Mini-Dev benchmark. Beyond standard accuracy metrics, we also report inference latency and token consumption, providing insights relevant for practical system deployment. Our findings reveal that Divide-and-Conquer prompting and few-shot demonstrations consistently enhance performance for both general-purpose and reasoning-focused LLMs. However, introducing additional workflow steps yields mixed results, and base model selection plays a critical role. This work sheds light on the practical trade-offs between accuracy, efficiency, and complexity when deploying Text2SQL systems.


From Natural Language Instructions to Complex Processes: Issues in Chaining Trigger Action Rules

arXiv.org Artificial Intelligence

Automation services for complex business processes usually require a high level of information technology literacy. There is strong demand for intelligent process automation (IPA), a smartly assisted service that lets even general users apply advanced automation easily. A natural language interface is expected to be a core technology for realizing IPA. The workflows targeted by IPA are generally composed of multiple tasks. However, semantic parsing, a natural language processing method, has not yet been fully studied for such complex workflows. There are two reasons: (1) the formal expression and grammar of workflows required for semantic analysis have not been sufficiently examined, and (2) no dataset existed that pairs workflow formal expressions with the corresponding natural language descriptions needed to learn workflow semantics. This paper defines a new grammar for complex workflows that chains machine-executable meaning representations for semantic parsing. The representations are at a high abstraction level. Additionally, an approach to creating datasets based on this grammar is proposed.
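As a rough illustration of what "chaining" trigger-action rules can mean, the sketch below models each rule as a trigger event, an action, and an optional event it emits, so that one rule's output can fire the next. This is not the grammar defined in the paper; the names `Rule`, `run_chain`, and the example events are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Rule:
    """Hypothetical trigger-action rule (not the paper's representation)."""
    trigger: str                      # event name that fires this rule
    action: Callable[[dict], dict]    # returns fields to merge into the payload
    emits: Optional[str] = None       # event emitted afterwards, if any


def run_chain(rules: List[Rule], event: str, payload: dict) -> List[str]:
    """Fire rules in sequence, following each rule's emitted event."""
    by_trigger = {rule.trigger: rule for rule in rules}
    fired: List[str] = []
    current: Optional[str] = event
    while current is not None and current in by_trigger:
        if current in fired:          # stop if the chain loops back on itself
            break
        rule = by_trigger[current]
        payload.update(rule.action(payload))
        fired.append(current)
        current = rule.emits
    return fired


# Two chained rules: saving an attachment triggers a notification.
rules = [
    Rule("email_received",
         lambda p: {"attachment": p["email"] + ".pdf"},
         emits="attachment_saved"),
    Rule("attachment_saved", lambda p: {"notified": True}),
]
payload = {"email": "invoice"}
fired = run_chain(rules, "email_received", payload)
```

Here `fired` records the order in which triggers were handled, and `payload` accumulates the results of each action along the chain.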


2020: The year the office finds its voice?

#artificialintelligence

While voice-based digital assistants such as Amazon Alexa, Apple Siri and Google Assistant are becoming increasingly common at home – and smartphones and wearables can be used hands-free via speech – the use of voice in the workplace is just getting started. That's likely to change in 2020 and beyond. The expected benefits include more efficient employees, "smarter" voice-based assistants, easier ways of completing routine tasks and a digital experience in the office that matches what's used at home. A survey by 451 Research in 2019 indicated that voice UIs and digital assistants are among the most disruptive technologies for enterprises (IoT and AI are the top two), with four in 10 respondents planning to adopt voice technology within 24 months. "I expect 2020 will be the year when voice user interfaces will become prevalent in the workplace," said Raúl Castañón-Martínez, a senior analyst at 451 Research.


Crowdsourcing Complex Workflows under Budget Constraints

AAAI Conferences

We consider the problem of task allocation in crowdsourcing systems with multiple complex workflows, each of which consists of a set of inter-dependent micro-tasks. We propose Budgeteer, an algorithm to solve this problem under a budget constraint. In particular, our algorithm first calculates an efficient way to allocate budget to each workflow. It then determines the number of inter-dependent micro-tasks and the price to pay for each task within each workflow, given the corresponding budget constraints. We empirically evaluate it on a well-known crowdsourcing-based text correction workflow using Amazon Mechanical Turk, and show that Budgeteer can achieve similar levels of accuracy to current benchmarks, but is on average 45% cheaper.
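The two-stage structure described above (split the budget across workflows, then decide task counts within each workflow) can be sketched as follows. This is not the Budgeteer algorithm, whose allocation rule the abstract does not give; it is a naive proportional split under assumed inputs. The function names, the per-workflow weights, the fixed price per task, and the equal split across stages are all hypothetical. Amounts are in integer cents to avoid floating-point rounding.

```python
from typing import Dict


def allocate_budget(total_cents: int, weights: Dict[str, int]) -> Dict[str, int]:
    """Stage 1 (assumed rule): split the budget in proportion to the weights."""
    total_weight = sum(weights.values())
    return {name: total_cents * w // total_weight for name, w in weights.items()}


def plan_workflow(budget_cents: int, price_cents: int, num_stages: int) -> Dict[str, int]:
    """Stage 2 (assumed rule): equal budget per stage, fixed price per task."""
    per_stage = budget_cents // num_stages
    tasks_per_stage = per_stage // price_cents
    return {
        "tasks_per_stage": tasks_per_stage,
        "price_cents": price_cents,
        "spent_cents": tasks_per_stage * price_cents * num_stages,
    }


# Two hypothetical workflows sharing a 400-cent budget.
shares = allocate_budget(400, {"workflow_a": 1, "workflow_b": 3})
plan_b = plan_workflow(shares["workflow_b"], price_cents=5, num_stages=3)
```

A real allocator would also tune the price per task and account for how errors propagate between dependent micro-tasks, which this sketch ignores.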